Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
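As a quick orientation, here is a minimal sketch of querying the released checkpoints through the Hugging Face `transformers` library; `bigscience/bloom-560m` is one of the smaller released variants (the full 176B model is published as `bigscience/bloom`), and the prompt is an illustrative example only:

```python
# Minimal sketch: prompting a released BLOOM checkpoint via the
# Hugging Face `transformers` library.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloom-560m"  # smaller released variant; full model: "bigscience/bloom"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

prompt = "Translate to French: 'The cat sat on the mat.' ->"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```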
Diabetic retinopathy (DR) is one of the leading causes of blindness among the working-age population of developed countries, caused by a side effect of diabetes that reduces the blood supply to the retina. Deep neural networks have been widely used in automated systems for DR classification on fundus images. However, these models require a large number of annotated images. In the medical domain, annotations from experts are costly, tedious, and time-consuming, so only a limited number of annotated images are available. This paper presents a semi-supervised method that leverages unlabeled images alongside labeled images to train a model for detecting diabetic retinopathy. The proposed method uses unsupervised pretraining via self-supervised learning, followed by supervised fine-tuning with a small set of labeled images and knowledge distillation, to improve performance on the classification task. The method was evaluated on the EyePACS test and Messidor-2 datasets, achieving 0.94 and 0.89 AUC respectively, using only 2% of the labeled images in the EyePACS training set.
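The abstract does not spell out the exact losses, but the fine-tuning-with-distillation step it describes is typically implemented along these lines; the temperature and mixing weight below are illustrative assumptions, not values from the paper:

```python
# Hedged sketch of the distillation step: a teacher trained on the small
# labeled set produces soft targets that guide a student network.
# `T` and `alpha` are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy on the few labeled images.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```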
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. It is a memory-based method, i.e., it uses the entire training dataset for prediction, which makes it unsuitable for most current big-data applications. Several strategies, such as tree-based or hashing-based estimators, have been proposed to improve the efficiency of kernel density estimation methods. The novel density matrix kernel density estimation method (DMKDE) uses density matrices, a quantum-mechanical formalism, and random Fourier features (an explicit kernel approximation) to produce density estimates. This method has its roots in KDE and can be seen as an approximation method without its memory-based restriction. In this paper, we systematically evaluate the novel DMKDE algorithm and compare it against other state-of-the-art fast procedures for approximating kernel density estimation on different synthetic datasets. Our experimental results show that DMKDE is on par with its competitors at computing density estimates and shows an advantage when executed on high-dimensional data. We make all our code available as an open-source software repository.
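A minimal sketch of the DMKDE construction as the abstract describes it, assuming a Gaussian kernel and omitting normalization constants: the training set is compressed once into a density matrix, so each query costs O(D^2) in the number of Fourier features rather than O(N) in the number of training points:

```python
# Hedged sketch of DMKDE: random Fourier features approximate a Gaussian
# kernel, and a density matrix built once from the training data replaces
# the memory-based sum over all training points.
import numpy as np

rng = np.random.default_rng(0)

def rff(X, W, b):
    # Explicit feature map z(x) with z(x) @ z(y) ~ Gaussian kernel k(x, y).
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

d, D, gamma = 2, 256, 1.0                    # input dim, feature count, kernel width
W = rng.normal(0.0, np.sqrt(2 * gamma), size=(d, D))
b = rng.uniform(0.0, 2 * np.pi, size=D)

X_train = rng.normal(size=(1000, d))
Z = rff(X_train, W, b)
rho = Z.T @ Z / len(Z)                       # density matrix: averaged outer products

def density(x):
    z = rff(x.reshape(1, -1), W, b)[0]
    return z @ rho @ z                       # unnormalized estimate, O(D^2) per query

print(density(np.zeros(d)))
```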
Density estimation is a fundamental task in statistics and machine learning applications. Kernel density estimation is a powerful tool for nonparametric density estimation in low dimensions; however, its performance is poor in higher dimensions. Moreover, its prediction complexity scales linearly with the number of training data points. This paper presents a method for neural density estimation that can be seen as a type of kernel density estimation, but without its high prediction computational complexity. The method is based on density matrices, a formalism used in quantum mechanics, and adaptive Fourier features. The method can be trained without optimization, but it can also be integrated with deep learning architectures and trained using gradient descent; it can therefore be seen as a form of neural density estimation. The method was evaluated on different synthetic and real datasets, and its performance was compared against state-of-the-art neural density estimation methods, obtaining competitive results.
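A hedged sketch of the "adaptive" variant in PyTorch, assuming (as in the previous abstract) that densities are quadratic forms in a Fourier feature map; here the Fourier parameters and a low-rank factor of the density matrix are trainable, which is an illustrative design choice rather than the paper's exact architecture:

```python
# Hedged sketch: trainable Fourier features plus a low-rank density matrix,
# so the estimator can sit inside a deep architecture and be fit by
# gradient descent (e.g., by maximizing penalized log-likelihood).
import torch
import torch.nn as nn

class AdaptiveFourierDensity(nn.Module):
    def __init__(self, d, D, rank=32):
        super().__init__()
        self.W = nn.Parameter(torch.randn(d, D))
        self.b = nn.Parameter(2 * torch.pi * torch.rand(D))
        # Low-rank factor V with rho = V^T V, PSD by construction.
        self.V = nn.Parameter(torch.randn(rank, D) / D**0.5)

    def forward(self, x):
        z = (2.0 / self.W.shape[1]) ** 0.5 * torch.cos(x @ self.W + self.b)
        return ((z @ self.V.T) ** 2).sum(dim=1)  # z^T rho z, unnormalized
```

In the optimization-free mode the abstract mentions, the features would instead stay fixed and the density matrix would be estimated directly from the training data, as in the previous sketch.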
Neural quantum states are variational wave functions parameterized by artificial neural networks, a mathematical model that has been studied in the machine learning community for decades. In the context of many-body physics, methods such as variational Monte Carlo with neural quantum states as variational wave functions have been successful in approximating, with high precision, the ground state of a quantum Hamiltonian. However, all the difficulties of proposing neural network architectures, together with exploring their expressivity and trainability, carry over to their application as neural quantum states. In this paper, we consider the Feynman-Kitaev Hamiltonian for the transverse-field Ising model, whose ground state encodes the time evolution of a spin chain at discrete time steps. We show how this ground-state problem becomes particularly challenging for the trainability of neural quantum states as the number of time steps increases, because the true ground state becomes more entangled and its probability distribution starts to spread across the Hilbert space. Our results indicate that the considered neural quantum states are capable of accurately approximating the true ground state of the system, i.e., they are expressive enough. However, extensive hyperparameter tuning experiments point to the empirical fact that poor trainability, in the variational Monte Carlo setup, prevents a faithful approximation of the true ground state.
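For readers unfamiliar with the setup, here is a toy illustration of the variational Monte Carlo loop on a plain transverse-field Ising chain; the paper's Feynman-Kitaev construction is more involved, and the quadratic log-amplitude below merely stands in for a neural quantum state:

```python
# Toy VMC sketch: sample spin configurations from |psi|^2 with Metropolis,
# then estimate the energy by averaging the local energy E_loc(s).
import numpy as np

rng = np.random.default_rng(0)
L, h = 8, 1.0                                # chain length, transverse field
params = rng.normal(scale=0.1, size=(L, L))  # toy quadratic log-amplitude

def log_psi(s):                              # s in {-1, +1}^L
    return s @ params @ s                    # stand-in for a neural network

def local_energy(s):
    e = -np.sum(s * np.roll(s, -1))          # -sum_i s_i s_{i+1} (periodic)
    for i in range(L):                       # off-diagonal: -h sum_i psi(s^i)/psi(s)
        flipped = s.copy(); flipped[i] *= -1
        e -= h * np.exp(log_psi(flipped) - log_psi(s))
    return e

s = rng.choice([-1, 1], size=L)
energies = []
for step in range(5000):
    i = rng.integers(L)
    prop = s.copy(); prop[i] *= -1
    if rng.random() < np.exp(2 * (log_psi(prop) - log_psi(s))):
        s = prop                             # accept with ratio |psi'|^2 / |psi|^2
    if step > 1000:                          # discard burn-in
        energies.append(local_energy(s))
print("estimated energy:", np.mean(energies))
```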
A density matrix describes the statistical state of a quantum system. It is a powerful formalism that represents both the quantum and classical uncertainty of a quantum system and expresses different statistical operations, such as measurement, system combination, and expectations, as linear-algebra operations. This paper explores how density matrices can be used as a building block for machine learning models, exploiting their ability to combine linear algebra and probability directly. One of the main results of the paper is to show that density matrices coupled with random Fourier features can approximate arbitrary probability distributions over $\mathbb{R}^n$. Based on this finding, the paper builds different models for density estimation, classification, and regression. These models are differentiable, so it is possible to integrate them with other differentiable components, such as deep learning architectures, and to learn their parameters using gradient-based optimization. In addition, the paper presents optimization-free training strategies based on estimation and model averaging. The models are evaluated on benchmark tasks, and the results are reported and discussed.
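In notation consistent with the abstract but assumed rather than quoted from the paper, the core construction can be summarized as follows: a random Fourier feature map $z$ approximates a Gaussian kernel, training states are averaged into a density matrix $\rho$, and the density estimate is a quadratic form in $\rho$:

$$z(x) = \sqrt{\tfrac{2}{D}}\,\cos(Wx + b), \qquad z(x)^{\top} z(y) \approx e^{-\gamma \lVert x - y \rVert^{2}},$$

$$\rho = \frac{1}{N} \sum_{i=1}^{N} z(x_i)\, z(x_i)^{\top}, \qquad \hat{f}(x) \propto z(x)^{\top} \rho\, z(x).$$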
Recent years have seen a proliferation of research on adversarial machine learning. Numerous papers demonstrate powerful algorithmic attacks against a wide variety of machine learning (ML) models, and numerous other papers propose defenses that can withstand most attacks. However, abundant real-world evidence suggests that actual attackers use simple tactics to subvert ML-driven systems, and as a result security practitioners have not prioritized adversarial ML defenses. Motivated by the apparent gap between researchers and practitioners, this position paper aims to bridge the two domains. We first present three real-world case studies from which we can glean practical insights unknown or neglected in research. Next we analyze all adversarial ML papers recently published in top security conferences, highlighting positive trends and blind spots. Finally, we state positions on precise and cost-driven threat modeling, collaboration between industry and academia, and reproducible research. We believe that our positions, if adopted, will increase the real-world impact of future endeavours in adversarial ML, bringing both researchers and practitioners closer to their shared goal of improving the security of ML systems.
When simulating soft robots, both their morphology and their controllers play important roles in task performance. This paper introduces a new method to co-evolve these two components in the same process. We do that by using the hyperNEAT algorithm to generate two separate neural networks in one pass, one responsible for the design of the robot body structure and the other for the control of the robot. The key difference between our method and most existing approaches is that it does not treat the development of the morphology and the controller as separate processes. Similar to nature, our method derives both the "brain" and the "body" of an agent from a single genome and develops them together. While our approach is more realistic and does not require an arbitrary separation of processes during evolution, it also makes the problem more complex, because the search space for this single genome becomes larger and any mutation to the genome affects the "brain" and the "body" at the same time. Additionally, we present a new speciation function that takes into consideration both the genotypic distance, as is standard for NEAT, and the similarity between robot bodies. By using this function, agents with very different bodies are more likely to be in different species, which allows robots with different morphologies to have more specialized controllers, since they will not cross over with other robots that are too different from them. We evaluate the presented methods on four tasks and observe that, even though the search space is larger, having a single genome makes the evolution process converge faster compared to having separate genomes for body and control. The agents in our population also show morphologies with a high degree of regularity and controllers capable of coordinating the voxels to produce the necessary movements.
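A hedged sketch of the proposed speciation function; the simplified NEAT compatibility distance, the coefficients, and the voxel-difference measure below are illustrative assumptions rather than the paper's exact definitions:

```python
# Hedged sketch: blend the standard NEAT genotypic distance with a
# body-similarity term, so agents with very different morphologies tend
# to fall into different species.

def neat_distance(g1, g2, c_excess=1.0, c_weight=0.4):
    # Genomes as {innovation_number: weight}; simplified NEAT distance.
    shared = g1.keys() & g2.keys()
    disjoint = len(g1.keys() ^ g2.keys())
    n = max(len(g1), len(g2), 1)
    w_diff = sum(abs(g1[i] - g2[i]) for i in shared) / max(len(shared), 1)
    return c_excess * disjoint / n + c_weight * w_diff

def speciation_distance(g1, g2, body1, body2, w_geno=1.0, w_body=1.0):
    # Body dissimilarity: fraction of voxel positions with differing material.
    d_body = sum(a != b for a, b in zip(body1, body2)) / max(len(body1), 1)
    return w_geno * neat_distance(g1, g2) + w_body * d_body

# Example: two agents with similar genomes but different voxel bodies.
print(speciation_distance({1: 0.5, 2: -0.3}, {1: 0.45, 2: -0.2},
                          "111000111", "101010101"))
```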
We describe a Physics-Informed Neural Network (PINN) that simulates the flow induced by the astronomical tide in a synthetic port channel, with dimensions based on the Santos-São Vicente-Bertioga Estuarine System. PINN models aim to combine the knowledge of physical systems and data-driven machine learning models. This is done by training a neural network to minimize the residuals of the governing equations at sample points. In this work, our flow is governed by the Navier-Stokes equations with some approximations. There are two main novelties in this paper. First, we design our model to assume that the flow is periodic in time, which is not feasible in conventional simulation methods. Second, we evaluate the benefit of resampling the function-evaluation points during training, which has a near-zero computational cost and has been verified to improve the final model, especially for small batch sizes. Finally, we discuss some limitations of the approximations used in the Navier-Stokes equations regarding the modeling of turbulence and how it interacts with PINNs.
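A hedged sketch of the paper's two ideas on a toy 1-D advection equation rather than the Navier-Stokes system: time-periodicity is hard-wired by feeding the network (sin ωt, cos ωt) instead of raw time, and collocation points are resampled at every step at near-zero cost; the frequency, architecture, and PDE are illustrative choices only:

```python
# Hedged PINN sketch: periodic time embedding + resampled collocation points,
# demonstrated on the toy advection equation u_t + u u_x = 0.
import torch
import torch.nn as nn

omega = 2 * torch.pi                      # assumed tidal frequency
net = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 1))

def u(x, t):
    # Periodic embedding guarantees u(x, t) = u(x, t + 2*pi/omega).
    inp = torch.cat([x, torch.sin(omega * t), torch.cos(omega * t)], dim=1)
    return net(inp)

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(1000):
    # Resample fresh collocation points each step instead of a fixed grid.
    x = torch.rand(256, 1, requires_grad=True)
    t = torch.rand(256, 1, requires_grad=True)
    out = u(x, t)
    u_t = torch.autograd.grad(out.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(out.sum(), x, create_graph=True)[0]
    residual = u_t + out * u_x            # PDE residual at the sampled points
    loss = (residual ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```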
Reinforcement learning allows machines to learn from their own experience. Nowadays, it is used in safety-critical applications such as autonomous driving, despite being vulnerable to attacks carefully crafted either to prevent the reinforcement learning algorithm from learning an effective and reliable policy, or to induce the trained agent to make a wrong decision. The literature on the security of reinforcement learning is growing rapidly, and several surveys have been proposed to shed light on this field. However, their categorizations are insufficient for choosing an appropriate defense given the kind of system at hand. In our survey, we not only overcome this limitation by considering a different perspective, but also discuss the applicability of state-of-the-art attacks and defenses when reinforcement learning algorithms are used in the context of autonomous driving.